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FOREWORD 



A salient component of current reforms and restructuring of educational 
^sterns in countries of Asia and the Pacific is the modernization of educa- 
tional planning and management approaches and mechanisms. The introduction 
of applications of new information technology, such as the computer, is fast be- 
coming a major thrust in the region. 

The productive and service sectors have been pioneers in the application 
of computer technology to management, through the development of what is com- 
monly called (MIS) (Management Information System). This approach is spreading to the 
education sector. 

The education sector remains one of the biggest employer within the national 
economy. It usually caters to a very large client group: the learners, who comprise 
easily more than half of the total population, particularly those below 25 years of age. 
Within the national budget, the education sector also operates huge financial resources 
which very often are second in size only to defence. The number of physicol 
facilities such as school buildings, furniture, textbooks and other materials ac- 
quired, operated, and maintained is not only vast, but also very much dispersed within 
the national territory. 

Also taking into account the rapidly growing non-formal and informal streams of 
education, it is clear thot the entire planning and management tasks would be insur- 
mountable, unless good use is made of modern management approaches and 
tools. This has given birth to o new area of development, namely that of Educa- 
tional Managerrient Information System (EMIS), which deals with the introduction 
of modern information management techniques and applications for the improvement 
of educational planning and management. 

The past few years have witnessed the rapid proliferation in countries of 
the region of a new generation of computer: the microcomputers. Techno- 
logical progress in microcomputer hardware and software capabilities has given rise 
to a wide panoply of user-specific automated opplications. There is the the growing 
need to draw from experiences that have been accumulated thus far to create 
and adapt similar facilities for the management of education. The Educational 
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Planning. Management and Statistics Unit of the UNESCO Principal Regional Office for 
Asia and tfie Pacific has therefore commissioned the present study as the first in its 
on-going efforts to contribute to the sharing of expertise, experiences, and ideas in 
EMIS. 

UNESCO owes special thanks to the author: Mr. Quo Sheng of the Education 
Commission of Guangdong Province of the People's Republic of China, for docu- 
menting in this study his valuable experiment and experience in thp processing 
on 0 sun^ey on the physical condition of students using microcomputers. It is hoped 
that this study will provide new insights and new impetus to other countries in the 
development of microcomputer-based applications in EMIS. 



Educational Planning, Management 
and Statistics Unit, PROAP 
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Chapter One 
INTRODUCTION 



The present study sumnfiarizes the methodological approaches adopted as well 
as experiences gained in the course of a sample survey launched in 1985 in Guang- 
dong province, China. The survey was undertaken to assess the level of stu- 
dents' physical condition and health in Guangdong province. 

In China, it has long been considered important to monitor and observe 
the physical development oi children and youths and the patho-genesis of common 
ailments over a long period of time. Through a study of the physical characteristics 
and health of students from different regions, sexes and minority nationalities, 
this survey aims at providing a scientific basis and guidance to the expansion and 
improvement of physical culture and hygiene in schools. 

In accordance with the national plan of China, Guangdong province estab- 
lished in 1983 the system of Chinese Student Physical Condition and Health Record 
Card at every level and in al! kinds of schools. Observation points were in- 
troduced which regularly monitored and carried out research and studies on changes 
in the students' physical condition and health, from 1983 to 1986. The present 
sample survey on the students' physical condition and health was launched to fur- 
ther strengthen the information base for the establishment of better standards 
of health services. 

This survey was characlerized by a number of innovative approaches. The first 
was the use of sampling lechniques, instead of the usual full-scale census report- 
ing, for the collection of information within the education sector. The use of 
microcomputers to handle the entire processing and analysis of data was an- 
other major innovation, which proved to be very positive ond valuable. There is 
no doubt that this experience will contribute to the further development and expan- 
sion of microcomputer-based data processing in education. 

A third innovative aspect encompasses the wide range of original applica- 
tions developed for data input, verification and analysis, using a common spreadsheet 
software package, the LOTUS 1-2-3. The brilliant ideas behind these applica- 
tions have opened up new avenues not only in the current modernization and 
microcomputerization of educational planning ond management, but also survey 
data processing in general. 

As a methodological contribution, this study shall not attempt to summarize 
and present the survey findings, but rather the methods, techniques and soft- 
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ware applications employed during the survey data processing, as well as the experi- 
ence gained through their application. II is hoped that this experience will lead to 
other new and better ideas, approaches, methods and techniques, thus generating 
further developmental work in related applications in other countries. 

The present study contains six chapters, beginning with the Introduction 
which gives the background to the sample survey. The remaining parts are 
organized in ^,uch a way that the scope and coverage of the survey, as well as the 
sampling techniques adopted are described in Chapter Two. Chapter Three explains 
in detail the methods used in survey data processing by microcomputer withir the 
LOTUS 1-2-3 software environment. Chapter Four summarizes the findings 
from the methodological point of view. Chapter Five attempts to hiahliaht some of 
the problenfis encountered c'uring this experiment, and some of the solutions 
adopted to solve them. And the last chapter draws salient conclusions and suggests 
future directions of development. 

A number of tables and charts are included in appropriate places to illustrate 
the approach and results. In the Annex, a sample of the Chinese Student 
Physical Condition and Health Card is included, along with listings of two LOTUS 
1-2-3 macro programs used. 
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Chopter Two 
SURVEY METHODS 



Survey Grouping 

Respondents to this survey were drawn from university and college students as 
well as secondary and primary school pupils aged from 7 to 22. The age-group 7 to 12 
corresponds to primary education in China, 13 to 18 to secondary schooling, and 19 
to 22 to college and university education. 

For primary and secondary levels, an urban and a rural area that were represen- 
totive in terms of students' phyi'cal conditions and health were selected from each 
prefecture and municipality in Guangdong province. The selection of these areas was 
also bQse6 on economic and geographical factors. Alwut 120 male and female students 
from every age-group, in both the urban and rural areas, were surveyed. Students in 
the Hainan Li and Miao Autonomous Prefectures were classified according to Han or 
Li nationalities, instead of the urbon/rural criterion. Three high schools and three pri- 
mary schools from areas with moderate living standards and sports facilities were ran- 
domly sampled in each of the selected areas. 

For undergraduates in colleges and universities, three higher educational 
institutions which could provide data on the state of health of students born in Guang- 
dong province were selected. Colleae students from each specific age-group, number- 
ing 200 each for either sex from tne urban areas, and 100 each from rural areas, were 
surveyed. 

The survey respondents were required to be in good health and not to have any 
physical deformity. Students in special "chools or classes, in particular in physical 
culture or art colleges, were not covered by the survey. 

In addition to groupings that separated urban and rural students, information 
on groups which combine these two sections of data for the same age-group ond 
sex were also collected. The sample data of Guangdong provided dato from large 
cities, mountainous areci, and coastal areas. 102 cards from each segment for each 
single group were sampled, so there were 306 cards for every group in the provinciol 
sample. 

Students included in the survey were classified according to six types: Han and 
Li nationalities, male and female, urban and rural. About 70,000 valid cards needed to 
be processed. Considering that 46 data items or indicators were included in each 
card, more than three million data items were processed. Microcomputers of the IBM- 
PC/XT type were used for this purpose. 
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Data items 



Each sfudeni was studied according to four types of indicators: size, function, 
diathesis, and health examination. 

In the physical measurement of the body, the size indicators were used to study 
the relationship among growth, speed and body development. The six size indicators 
used during the survey were height, weight, chest circumference, sitting height, shoulder 
breadth, and pelvic breadth. 

Functional indicators reflected physical constitution in terms of the function- 
ning of organs. These indicators includoc* pulse, systolic pressure, diastolic pressure, 
vital capacity and others. 

Diathesis indicators were used to evaluate the speed, strength, and flexibility of 
the students by checking their sportive capabilities. Major diathesis indicators included 
nine test items, such as the 50 metre sprint, standing long jump, chin-up, inclined - 
body chin-up. sil-iips. and so on. 



Figure 1 . Items excrJned in (he romp/e w/vey by /e\/e! of educofion 
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The health examination sought to identify any pathological changes in the Iwdy. 
Vision, heart, lung, liver and the functionning of other organs were also examined. 

Different items were examined in the case of undergraduates, secondary 
school students, and primary school pupils, due to the different sportive adopta- 
bility among groups and sexes (Figure 1). For instance, 1,000-metre run was given 
as a test item to secondary school male students, 800-metre run for girls, and 50- 
metre X 8 run for boys and girls in primary schools. 

Survey procedure 

After the sample was determined and identified, survey interviewers were trained 
and survey sites and equipment prepared. The survey was fornf^olly carried out from 
April to June 1985 throughout the province of Guangdong. 

During the survey, survey personnel filled out the Chinese Student Physical Con- 
dition and Health Record Card (see Annex I) for each student covered in the survey, 
based on designated specificotions. 

To ensure the accuracy of the data, a random sample of some of the students 
who had already been surveyed was chosen to repeat the tests. The new results were 
compared with the original data. When differences surpassed a certain permissible 
limit, processing of the survey was stopped until the cards were corrected. 

Once data collection and verification were completed, the record cards were 
sorted according to sex, urban/rural, age-group and nationality, and then filed. 
During the course of card transfer, every item in every cord was checked repeat- 
edly. Erroneous cards were corrected and missing cards were added. In this way, 
reliability of data was improved. The number of valid cards that eventually under- 
went computerized processing totalled more than 70,000. 
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Chapter Three 

METHODS OF DATA PROCESSING 

WITH MICROCOMPUTER 



Crude and derived indicatcrs 

There were two pages to each card : the first contained the size, functional 
and diathesis indicators: and the second showed results of the physical examination, a 
list of common ailments and laboratory test results. As the survey covered pupils and 
students who have no physical atinormalies, only data for the first 22 crude 
indicators on the first page of the card were entered into the computer for proces- 
sing. 

Indicators 1 to 20 were quantitative data used for computational analysis. 
Indicators 21 and 22 were logical data used for sorting, counting and grouping. 
Each of them had two possible values: 0 or 1. For item 21, 0 indicoted that a girl has 
not had menstruation yet; 1 for yes. For item 22, signified that the student has 
normal vision {vision index greater than or equal L 1), and 1 for hypopsia (vision 
index smaller than 1). 

Through various ccmbinations and further computations of different crude 
indicators, derived indicators were obtained, which reflected ratio relationships between 
various components of health and physical condition within the human body. For 
example, the Livi index, calculated by taking the percentage chest circumference 
divided by height then multiplied by 100, showed the nutritional status of a student. 
A total of 24 derived indicators were computed in preliminary survey data processing. 

Statistics 

Basic stotistics computed were as follows: maximum and minimum values, 
overage, standard deviation and standard error, variance and percentiles (3%, 5%, 

10% 90%, 95%, 97%) of every crude and derived indicator by group. Each indicator 

in every group was also checked for normal distribution. Absolute increases, growth 
rates and adjusted growth rates, as well as standard ratios giving proportions of every 
age to a specific age of each indicator were also calculated to investigate the 
development and growth of students as age changes. Some of the differences 
between rural and urban, male and female, Han and Li Nationalities, were also 
compored for the some indicotor ot the some age. 
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When the data had been sorted and camputed, linear regressian madels with 
single and multiple variables were built ta study the carrelatian between different 
indicatars. Ihese madels enabled the setting up af standards far the evaluatian af 
the students' physical canditian and health. This was particularly important far the 
study af functional and diathetic characteristics among different body configurotions. 

Data processing hardware and software 

Processing of the sample survey on physical condition and heolth was carried 
out using two IBM-PC/XT microcomputers at the Computer Center of the Guangdong 
Province Department of Education. Both microcomputers hove o 10MB fixed 6si and 
0 360KB disk drive. One has 640KB randam-access-memary(RAM) and the other 
512KB. As there were many IBM-PCs and compotible computers in Guangdong 
province, it was possible to moke use of external help from other deportments 
equipped with similar microcomputers to carry out and complete the data entry. In 
the future, data transfer to and from the mainframe will also be passible. 

The LOTUS 1-2-3 software package was used for data processing, owing mainly 
to its powerful functions which integrate electroni: spreadsheet, dotobose management 
and graphics, lis capabilities for data moiiipuiotion, cofculotions, analysis, tabulation end 
graphical presentation, as well as th? many statistical functions offered, mode it the 
ideol choice for the task at hand. Furthermore, it incorporates macro programming 
language for automating repetitive procedures through o series of sequential 
commands. The experiment was therefore mode to process and analyze the entire 
sample survey data through the use of LOTUS 1-2-3. 

Data structure 

The maximum space in o single LOTUS 1-2-3 worksheet consists of 2,048 rows 
ond 256 columns of data (new versions provide for 8,192 rows). Most PCs hove about 
512KB of memory. Excluding at least 100KB used by the system, there were 
approximately 300KB to 400KB of memory left for the worksheet itself. As o result, 
only 0 limited quantity of data could be processed at o time. 

In view of the more than 3 million crude and derived data in the survey, it 
wos neither passible nor necessary to put oil the data in o single data file. In 
accordance with the set objectives of survey data processing and analysis, the doto 
were classified, inputter' and stared separately in o number of LOTUS worksheet 
files, and processed subsequently by group. 

The data file names were determined according to nationalities, urbon/rurol 
oreos, sex and age. By letting U stor.d for urban, R for rural, M for mole, F for 
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female, H for Han nniionality, and L foi Li nationality, rules for file names were as 
follows: 

lu/RlM/FlflOel 
iL/HlH/Fliigel 

There were consequently eight kinds of file names: UM, UP, RM, RF, HM, HP, 
IH, r For instance, UF12 represented the group of 12-year old girls from municipali- 
ties Of ioicns. 

Cards in every group were numbered in ascending order, starting with 1. This 
altowed data to be checked or verified conveniently later. Care was also taken to ensure 
that ull the groups had roughly the same number of cards. Programs designed for 
processing and anahysis could then be standardized. 

By making use of the database management functions of LOTUS 1-2-3, 
worksheets were created, with data organized in such a way that rows and columns 
contained related information. In a typical worksheet, the first row indicated the file 
name of the group. In the second row were field names in the sequence as de- 
scribed in the survey cards. An additional first column was added to record the card 
f lumbers. The field names were labels in the form of numbers corresponding to the 
data items on the card. Field name 1 therefore stood for the first data item 
examined, namely "pulse"; field 2 represented the second data item, ^'systolic 
pressure"; and so on. 

The database records began from the row immediately below the field names. 
Eoch record was a row in the worksheet showing the physical condition and health 
information of a student. Each field was a column that contained data of the same 
indicator in the group (see Figure 2). Each worksheet was therefore organized as a 
database. For example, the cell located in position D7 represented the intersection 
of the fifth record and the second data item - "systolic pressure". It indicated that 
the systolic pressure of the fifth student in the group UF12 was 92. 

The data structure of the derived indicators was similar to that of the crude 
indicators. However, the field names in the second row of the worksheets were given 
in the form of characters, such as A, B, C, .., X. 

Certain crude indicators were not examined for some specific age-groups. In 
designing the data entry worksheet, the corresponding fields were protected so that 
data could not be entered into those cells. This reduced data entry errors. The 
procedures followed were as follows: firstly, to specify the entire data input range 
by the /Range Unprotect command; then to use the /Range Protect to mark the 



Microcompufer processing - o cose sfu(fy s 



A B C 0 E F G ..... X 




1 uf 12 



No. 


I 1 


2 


3 


4 


5... 


. 22 


1 


38 


110 


60 


40 


1690 


1 


2 


46 


128 


80 


70 


2340 


1 


3 


43 


96 


60 


50 


2330 


0 


4 


44 


102 


72 


66 


2650 


0 


5 ! 


45 


92 


70 


60 


1980 


0 


6 


39 


90 


56 


50 


2180 


0 


306 


42 


106 


70 


64 


1800 


0 



Figure 2. Sfrucfure of dofo fks 
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columns to prohibit data entry; and finally, use the command /Worksheet Global 
Protection Enable to turn on the protection facility. 

As soon as all the data were organized in the way described above, the 
powerful statistical and data management functions of LOTUS 1-2-3 were put to 
use in processing and analyzing the data in a fast and convenient manner. 

An ovefvtew of survey data processing 

A salient feature of this experiment in microcomputer processing of survey 
data was the use of LOTUS 1-2-3 macro programming facilities for automating 
data input, progressing and analysis. Macros not only reduce repetitive keystrokes 
and command operations, they have also become the building blocks for programs 
incorporating advanced features such as loops, subroutines and conditional branching. 
It is the macros that speed' up commands execution, thus improving the efficiency of 
data processing. 

The worksheets were designed to facilitate application of the macro 
programming technique within LOTUS 1-2-3. A typical worksheet consisted of three 
ranges: data range, formula range and program range (Figure 3). The data range 
was used to incorporate data file for processing. The formula range contained 
mathematical expressions for the calculation of derived indicators and statistics. 
When the jcalcj command in the macro program was encountered, the computer 
would perform certcin specific computations automatically. The program range was 
generally a column in which a series of cell entries recorded the step-by-step 
macro operations. Whenever the mocro was invoked, the computer would 
automatically load the dala file to be processed into the data range, sort and 
reorganize the data, as well as compute them. Finally, the results were saved in the 
diskettes. 

As 10MB of hard disk storage was nol sufficient lo hold all the files, only 
the relevant series of data files were copied inio it for each batch of the 
processing operations. Once the LOIUS 1-2-3 was started, the computation 
worksheet was loaded from a diskette into the main memory, and the macro invoked. 
Data files were then loaded one afler another into the data range. The processing 
subroutine was activated automatically once the data range was filled, so large 
amount of data could be processed repeatedly. In the computational 
worksheet demonstrated in Figure 3, once the data file UF7 had been processed, UF8 
was loaded und the same processing operations carried oul. In a similar manner, 
subsequent data files such as UF9, UflO, .., UF18 were processed.. 

Data processing of the physical condition and health survey could be roughW 
divided into four modules or phases: data prepara*'on, data entry, data analy.>.s 
and results output. They are shown in Figure 4. 
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The task of the data preparation module was to provide quality crude data for 
microcomputerized processing. During this stage, sample points were selected and 
data items examined by trained personnel. The cards were completed and repeat- 
edly checked; then sorted and bound into volumes. File names and card numbers 
were then assigned. 

At the stage of data entry, data were entered into computer storage following 
fixed formats and procedures. A duplex data entry system was employed in which 
data from the same card were entered twice into separate data files for cross- 
checking. Data entry errors were located and eliminated. This was followed by logical 
checks which, according to the age and sex of individual pupils, identified data values 
that either fell short of or exceeded the normally accepted value ranges, for example in 
body height or size. Logical checks based on the correlation between crude indicators 
were also performed to locate erroneous records which were then rectified or replaced. 

The data analysis module was used to sort, calculate and analyze the survey 
data. The first step was to compute derived indicators based on the crude indica- 
tors, in order to generate data files of derived indicators. At different stages of 
the calculation, patterns of student physical development were identified and norms 
for the evaluation of students' health determined. The appropriate reference curve 
graphs were also plotted. 

The results output module was used when analytical results were transferred 
to the printer for hard-copy output of tables and graphs for further study and inter- 
pretation. 

These modules will be described in detail in the following sections, beginning with 
data entry and data analysis. 

Progranrinriing of data entry nriodule 

Before data could be entered into computer storage, worksheets such as the 
one illustrated in Figure 2 were prepared. Dc'.a entry personnel filled out the work- 
sheet cells row by row, from left to right, and from top to bottom. With the duplex 
data entry procedure, the same group of data was entered twice and saved on disk- 
ettes under different file names. 

The COMPARE program was then used to compare duplex data files and locate 
inconsistencies. When these inconsistencies had been checked and corrected, the 
LOGICAL program was executed to make logical checks through the data. Illogical 
records were extracted and either corrected by special personnel or replaced with new 
logical cards with correct and logical data. The design and functions of the COM- 
PARE and LOGICAL programs are described below. 
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Assuming that A and B are the duplex data files of the some group with n 
cords in each group. The data are arranged under LOTUS 1-2-3 as a database in the 
form of a matrix with n rows and 22 columns (as there ore 2z data items in a card). 
By designating A as the matrix ja.j and B as \bJi, where i can vary from 1 to n and j 
from 1 to 22, according to the law of probability, it is rare that exactly the same dote 
item is entered twice with error during the duplex data entry. There is even lesser 
probability that the two erroneous entries are identical. Therefore, if one con ensure 
that o.=b- for all i's from 1 to n and all j's from 1 to 22, the data entries for the spe- 
cific batch of n cards can be considered as complete and free from error. 

The condition of a- = b- is equivalent to a. - b- = 0. 

0. - b. <===> a- - b. = 0 

With the /File Combine Subtract command in LOTUS 1-2-3, we can incor- 
porate the data file B into A in the COMPARE worksheet and subtract every datum in A 
from the corresponding datum in B, to give: 

c- = a- - b- 
1 'I 'I 

To locate the errors, the absolute values of c- are horizontally summed up to 
act as a tag for judging .whether there is any error in the data entry for each record, 
where: 

22 22 
Tog- I |c..|= I |a..-b..| 

i=l j=1 

When Tag; = 0, the ith record is correct. By being different from zero, it indicates er- 
ror. 

The purpose of taking the sum of the absolute values of c. is to eliminate the 
possibility that certain positive and negative values obtained by calculating the 
simple differences between a- and b- may cancel each other if the the horizontal 
sum of c-'s are taken. As a result, one may not be able to detect the existence of 
error in certain records, if the horizontal sums of the c-'s for those records are equal 
to zero due to the cancelling effect. 

In the COMPARE program, the flow of which is illustrated in Figure 5, through* the 
use of macros (please see macro program in Annex II), data file A is loaded into !he 
worksheet and used to check against data on the survey cards at a later stage. The 
matrix A is also duplicated through the /Copy command into another area in the 
worksheet, so that one can subtract B from it with the /File Combine Subtract com- 
mand. As described earlier, the field names and card numbers are recorded in the 
border ranges, so only the data parts of B need to be combined into the corresponding 
range by selecting the /FCS Named- range commands. 
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Figure 5. fhw chorf of Compore 
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After calculations of individual c-'s (= a. - b.) are completed, their horizontal 
sums are obtained and another database with a single field tag is set up. Records 
with togs which ate non-zero are identified by using the command /Data Query Find. 

The "window" commond in LOTUS 1-2-3 is used to split the screen into two 
windows for showing two sets of data at the same ti.ne. The upper window provides 
0 view on. the original data of file A. The lower window shows the results of a- - b- 
ond the tags. Once o non-zero value is detected for the tag, the corresponding 
value in A is checked against the cord records to locate the error. Three types of 
errors may arise when c- = 0: (a) only entries in 6 are wrong; (b) only entries in A 
ore wrong; or (c) both entries in A and B are wrong. For the first case, nothing 
needs to be done. For the second and last cases, modifications are made in the ap- 
propriate entry and data item. 



Fig ure 6. LMfs for /ogico/ check 
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Certoin logical relationships existed among the indicators for specific oge 
groups. Some of the maximum and minimum values of measurement are 
given in Rgure 6. Any datum that exceeds the maximum value or falls short of the 
minimum value was checked again for logical errors. For example, the pulse range 
for children of the 7 to 12 age-group usually varied from 30 to 60 times per half- 
minute. If the pulse of a 10-year old child was found to be less than 30 or greater 
than 60 times per half-minute, the record of this student was checke d again. 
Nineteen types of error were coded: as EOl, E02, ... , E19. 

For each type of error, a criterion range was set up to specify the error condi- 
tions- For the 7 to 12 age-group, the condition for E06 was that (sitting height/ 
height X 100) should be less than 46.32 and greater than 59.6. As mentioned earlier 
the field names of height and sitting height were in cells H2 and 12. Therefore, the cell 
address of standing lieight in the first record was H3, and sitting height 13. By enter- 
ing the two conditions in the two rowb low the field name *6 in the criterion range, 
the /Data Query Extract command was used to specify the criterion range and the 
whole database as input range, and to copy all records matching either of the condi- 
tions to an output range. 



The flow chart of LOGICAL can be seen in figure 7. When the macro was invoked, 
the data range was cleared and the specified dota file was incorporated in the 
current worksheet. The database range, output range and criterion ranges were 
automatically specified by the macro. The computer would check the nineteen logi- 
cal errors one after another. If logical inconsistencies were found, they were checked 
against the cards and rectified accordingly. The main portion of the program would 
re-execute until no more error was iound. When the entire dato file had been checked 
for logical errors, it was saved replacing the original file. 

Programming of data analysis module 

The data analysis module forms the centre-piece of the complete method- 
ology of microcomputer processing of the sample survey of the students* physical 
condition ond health. 
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This module first calculated the derived indicators from the crude indicotors. 
Through further computations based on both types of indicators, many statisticol 
tables and charts were generated for use in analysis and interpretation. Severol 
programs were designed and incorporated into this module: for data re- 
organization, regression analysis, as well as comparison among different groups. 
Of these programs, GENERAL was used in calculating common statistics; and another 
program known as TABLE was mainly used in making body development 
evaluation tables. 

Tht purpose of GENERAL was to compute the following statistics by group: 
maximum, minimum, mean, standard deviation, standard error, variance and 
percentiles. This program also checked for normal distribution. 

The LOTUS 1-2-3 statistical functions can simplify considerably the compu- 
tation of common statistics. When the range of an argument is given, such as 
c3..c308, one can calculate directly with functions like @count(c3..c308), 
@min(c3..c308), @max(c3..c308), @avg(c3..c308), @std(c3..c308), and so on. 

As soon as a formula for calculating basic statistics of the group was set up, 
all other formulas were filled in the formula range to obtain other statistics in the same 
way, with the /Copy command. 

To calculate the Xth percentile in a list of data with n examined values, the 
method adopted in the survey was to sort these data in ascending order first, 

XI < X2 < X3 < < Xn 

The Xth percentile required was determined by the sorted value located by 
the formula: 

P, = x%»(n + 1) (1) 

For instance, if n=306, P5 = 5% ♦ (306 + 1) = 15, that is. the 15th value in 
the sorted list was the 5th percentile of the indicator. 

As the processing methods for various indicators in the same group were 
identical, loops were used in the program GENERAL to repeatedly sort, one by one, 
the columns of data. As soon as all the data had been sorted, the corresponding rows 
of data were copied into the result ranges for percentiles. 

During the execution of a /Data Sort command, the data range to be sorted 
was defined in the first place, and then one of the indicators was designated as the 
primary sort key. When asked to determine the sort order, A specifies the ascend- 
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ing order and 0 the descending order. The GENERAL program calculated each indica- 
tor independently. By issuing the Go command, the computer immediately sorted the 
data in the assigned order. 

Once the data of eaci indicator had been sorted, the value of D wos derived 
using the following formula: 



By looking up the table, the p value was obtained. Based on this, it was verified 
whether the observed data for the particular indicator followed the normal distribution. 

The flow chart of GENERAL is illustrated in Figure 8. As the first step, the 
recalculation mode was set to manual so that LOTUS 1-2-3 would recalculate the for- 
mulae only when the jCALCj key was pressed. The data file was then loaded into 
the data range and the columns of data by indicator were sorted one by one, 
from left to right. A loop was incorporated to control the number of fields to be 
corted. With the execution of command jCALCj, all formulae were recalculated 
and statistics, such as min and max, as well as 0 values were obtained. The macro 
then copied the values of each ppf-entile. Finally, the computed results were saved 
onto diskette. 

The function of program TABLE was to produce standard summary tables for 
ttie evaluation of body development through the application of multiple regres- 
sion methods. Standard physical proportions were determined by fitting regres- 
sion equations, using height as the dependent variable and weight and chest circum- 
ference as independent variables: H = b^ + b, ♦ W + b^ ♦ CC. 

Before the program TABLE was executed, the data group names, the coeffi- 
cients and constant of the regression equation, as well as the minimum and 
maximum values of weight and chest circumference for that group, were entered into 
specific rslls. 

The formula for computing height was located on the upper left of the table. 
As the macro was invoked, the values of the two independent variables were filled 
with weight increments of 1 kg and chest circumference increments of 1 cm by 
the /Data Fil, command. The starting value wcs the minimum, the steps 1 and the 
stop value was the maximum. The evaluation table, showing height on the basis of 
weight and chest circumference, wos then created by the command /Oota Table 
(see Figure 9). 
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Chopter Four 
FINDINGS 



Statistical analysis by means of microcomputer^processing o^ the survey 
data on physical condition and health, as described above, resulted in many 
graphs and tables. Tables like the one illustrated in Figure 10 for group UFU* were 
obtained by running the program GENERAL. According to the average values of 
different indicators by age, the annual increments as well as rates of growth were 
also derived. 

A salient survey finding can be summarized in the table showing changes in 
height as children grow up year by year. Rgure 11 shows that 15-year oi'd boys have 
the fostest growth in height; and the physical development of urban girls occurs two 
years earlier as compared to urban boys and rural girls. The LOTUS 1-2-3 graphical 
features provide many possibilities for drawing graphs with the /Graph 
command, and for eventual plotting on hard-copy using the printer. The height 
increase and increase rate curves can therefore be illustrated as in Figures 12 and 
13. These enabled the comparison of differences between male and female, 
mountainous and coastal areas, Han and Li Nationalities, Guangdong and other 
provinces and even with other countries. 

Many evaluation standards were determined during the survey data proces- 
sing. For instance, the 80th percentile of weight was taken for the same height as the 
standard weight for that specific height. If one's weight was under 75 per cent of 
the standard, the child was severely under-nourished. When it was between 75 per 
cent and 85 per cent, there was moderate malnutrition. For levels between 85 per 
cent and 90 per cent, only light malnutrition was detected. Normal nutrition level 
would be situated between 90 per cent and 110 per cent. Between 110 per cent and 
120 per cent, the person was overweight. When it was obove 120 per cent, obesity 
was the diagnosis. 

The body development evaluation table for the group UF12. creoted by executing 
the program TABLE, is shown in Figure 14. From the differences between the height 
calculated based on weight and chest circumference and the actual height of a 
student, the physical proportion and development level of a student were determined. 
Evaluation standards were also established on the sports indicators by age and sex. 
which were used to grade students according to their sportive results. By 
means of standards established based on percentiles, students' sportive capocities 
could be evaluated in a more scientific manner. 
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Correlation analysis were carried ou* lo help lo undersland which size indi- 
cators were too high relative to the sports indicators. The study results had been 
useful for choosing athletes by taking into consideration a multiplicity of factors. 
Statistical results obtained for students with normal vision were also compared with 
those for students wiih poor vision. Some interesting results and conclusion were ob- 
tained. 

In short, through the sample survey and the ensuing microcomputer data 
processing, a large amount of salient information was obtained on the physical con- 
dition and health situation of students in the Guangdong Province. The results pro- 
vided a scientific bcc's for the general improvement of the students* health and 
their capacity for sports. 
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Chapter Five 



PROBLEMS ENCOUNTERED 
AND SOLUTION ADOPTED 



As this was the first major experiment in the use of microcomputer to process 
large amounts of survey data within the education sector, a variety of problems 
were encountered at different stages. 

The first problem was related to difficulties in conciliating the hardware 
capabilities of the microcomputer used and the statistical requirements of the 
survey. As described in Chapter Two, constraints of limited main memory and hard disk 
storage were solved by storing data in worksheet files on diskettes and processing 
them by group. 

If a group with 1,000 cards is in the current worksheet and every data cell 
corresponds to a formula, there will be at least 20,000 (=1,000 X 20) formulae to be 
calculated. If one tried to fill all the cells with formulae using the /Copy command, 
the ''Memory Fulf* error message in LOTUS 1 -2-3 *ould appear, because 
formulae take up much more memory space.. Therefore, a solution was to set up 
only one column of formulae, instead of 20 columns, and to copy each column of data 
into the formula range one at a time, for performing the calculations. This reduced 
the total number of formulae to abouM, 000. thus making 512KB of memory 
sufficient (see Figure 15). 

Sometimes there was the need (o combine several indicators of many groups 
to organize a new worksheet for analysis. In order to work out the height-weight 
standards, data of height and weight had to be extracted from each age groups for 
the same sex and then combined. 

At first, each of the data files was combined into the current worksheet 
through the command /File Combine Copy Entire. It was soon found out that the 
data range was too big, resulting in the "Memory Full'* error. Consequently, a range 
name was given to every field in the database with the /Range Name Create 
command, followed by loading any indicator of a group into the worksheet. This 
resulted in greatly reduced memory requirements and the speed of data 
processing increased considerably. 
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DATA 



FORMUU^S 



(1 ) All data are computed simultaneously 




(2) Processing data column by column 
Figure 1 5. Two meihods used in colcuhiing formuhs 
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Problems were olso encountered when moving the cell pointer to or from on 
oddress which were determined by formulo colculotion. For exomple, o percentile 
wos a dotum from o sorted list. The position of the datum wos determined by formulo 
(1) poge 19. If a progrom wos designed to colculote percentiles for different 
sample s'zes, different addresses would hove to be referred to when copying rows ot 
oppropriote locotions to the percentiles ronge. It was difficult, if not impossible, to 
prepare 0 macro progrom within LOTUS 1-2-3 to go to the different oddresses on 
the basis of the number of cords in a somple, becouse cell address was a label 
rather thon a number. Every time the somple size chonged, some oddresses in the 
mocro progrom GENERAL hod to be chonged occordingly. 
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C h 0 p I e r j i x 
CONCLUSION 




This paper introduced some typical applications of microcomputers in survey 
data processing. The experience confirmed that if planned and designed properly, 
microcomputers can be used very effectively in the processing of large amounts of 
data. 

Different possibilities and approaches are available on the 
microcomputer. Some people may prefer to write routines, programs or entire 
software packages in one of the microcomputer languages such as BASIC or PASCAL 
for the processing of specific data. Others may opt for the use of some of the 
ready-mode software packoges for database management or spreadsheet 
operations. 

Current trend shows that database management software systems tend to 
dominate. The experience described in this paper on the use of LOTUS 1-2-3, a 
typical spreadsheet software package, therefore added another dimension to 
microcomputer applications in educational management information. In many 
woys, the decision to use LOTUS 1-2- i could be attributed to the varied analytical 
features offered by this softwore. such as functions for the calculation of basic 
statistics and regression analysis. A first attempt has been made to make use of 
these features, through the macro programming language within LOTUS 1-2-3. The 
results hcye proved to be very positive and satisfactory. 

There is a growing tendency for most software packages to make extensive 
use of user-friendly menu-driven type of operational features. Many packages also 
hove expanded features for customizing data processing procedures, by incorporating 
progromming facilities, such as the macro programming language within LOTUS 1-2-3 
or the command files under DBASE III. It is expected that such facilities will further 
develop in the future. As microcomputers are becoming more accessible and easier 
to use, efforts on the part of statisticians, managers and planners of education 
to participate in the development of more powerful and multi-function programs for 
the entry, checking, storage, processing, search, retrievol. analysis and 
dissemination of education management information, will be required. 
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It was noticed during the data entry stage of this survey that as the more dato 
were being handled, the more frequently errors occurred. Consequently, more in- 
tensive data checking and corrections were required. The design of the survey ques- 
tionnaire was found to piay an essential role in facilitating data verification and en- 
try. Iri some countries, optical- marker readers (OMR) have been used for inputting 
I data directly from the questionnaires into computerized storage. Although this method 

!r 's still relatively costly, it is seen as one of the means by which the effectiveness and 

%l efficiency of data processing can be enhanced. 
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Annex i i 



COMPARE PROGRAMME 



1 
2 
3 
4 

5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 
27 
28 
29 
30 



AB AC AD AE 
REM: COMPARE PROGRAM 

/rea!.z640~ 

/wwc 

|qoto|al~ 

/fcc|?}~|esc| |esc| 
/wgpd 
|golo(c3~ 
/•/rtb 

/cal.x309~a321~ 
|Goto|c323~ 
/fcsncaa~|?|~ 
|goto|y323~ 

@abs{c323)+@abs{d323)+..,+@abs{m323)~ 
jrightj 

@abs{n323)+@ubs{o323)+,..+@ab3{x3?3)~ 
jrighlj 

+Y323+Z323~ 

/cy323.aa323'vy324.y630~ 

|up| 

jdownj |down| |down| Idownj |down| |down| |down| |down| jdown^ 

/wwhJhome( 

/wgrm 

>window| 

:M [upl |up| |up| |up| |up| |up| jupj lop( jdownl 

/xi@dcounl {aa322.aa630,0,crileria)=0~|home|0K~/xq 
/ dqiaa322.aa630~cab35.ab36~f 
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Annex III 



GENERAL PROGRAMME 



AC 

1 counter 1 

2 counter 



AO 



AE 



AF 



3 


\a 


4 




5 




6 




7 




8 
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loop 


10 




11 




12 




13 




14 




15 




16 


loopi 


17 




18 




19 




20 




21 




22 




23 




24 




25 




26 




27 




28 




29 




30 




31 




32 




33 




34 




35 





21 



/wgtm 

|qoto|G320~/fccecompute~ 
/reGl.x310~/rec322.x339~/caa3~aG3.GG308~ 
goto jG31 0~g~/cg31 0~g31 0.x31 0~|goto|G 1 ~/fcce 
goto counter~0~ 

goto|c3~/dsdb3.b308~pb3.b308~a~q 
/dfcounter~counter+ 1 ~1 ~~ 
/xicounter<21~/xgad12~ 
/xgad14~ 

/dsdjescj |right|.|end| |down|~p |esc| |right|.|end| jdownj 

/xgloop~ 

/rncvGlue~c321~ 

/rncdGtG~c3~Aecounter1~ 

/dfcounter1~counter1+^ 1~~ 

|goto|dGtG~/c|end| |down|~z3~ 

/ ■ncdGtG~|right|~ 

|cgIc| 

/fxvqGZ~Gb3.Gb10~r 

|goto|value~/fcceqGz~ 

/ ■ncvGlue~Jright|~ 

/ xicounter 1 <2 1 ~/xgloop1 ~ 

/cc11.x11~c329~ 

/cc17.x17~c330~ 

/cc33.x33~c331~ 

/cc48.x48~c332~ 

/cc79.x79~c333~ 

/cc156.x156~c334~ 

/cc232.x232~c335~ 

/cc248.x248~c336~ 

/cc278.x278~c337~ 

/cc294.x294~c338~ 

/cc300.x300~c339- 

/fxw|?|~a321.z343~ 
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